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I calculate a hybrid cross-power spectrum estimator from the WMAP 5-year CMB temperature 
maps, discuss the goodness of fit, and then constrain cosmological parameters. The spectrum 
and results are generally consistent with previous results, though the power spectrum error bars 
are slightly smaller and there are small shifts at high /. The small improvement in error bars is 
obtained at very low numerical cost but does not significantly improve parameter constraints. I 
discuss the accuracy of the likelihood model and how constraints on the optical depth translate 
into constraints on the reionization history allowing for helium reionization. In the appendices I 
propose a simple reionization parameterization that determines the history in terms of a mid-point 
reionization redshift, and suggest a new likelihood approximation for chi-squared-like distributions 
with varying skewness. 



I. INTRODUCTION 

With five years of data the WMAP satelhte can measure the microwave sky down to sub-degree scales [l| , providing 
accurate measurements of the angularpower spectrum and large scale polarization signal, and hence constraining 
many cosmological parameters [J H, |J. In this paper I analyse the temperature maps provided by the WMAP 
team, compressing the information into a set of power spectrum estimators, and then investigate the constraints on 
cosmological parameters. I use a hybrid cross-map power spectrum estimator that has slightly lower noise than the 
one used by the WMAP team. The effect on the cosmological parameters is modest, and the final parameter values 
are very consistent with the WMAP 5-year results; Fm submitting this paper to help avoid any publication bias 
towards results that happen to drift in a particular direction. 

Note that the scope of this paper is limited to a re-analysis of the foreground-cleaned temperature maps. I do 
not attempt to study the polarization signal or foreground removal. Details of the large scale polarization signal 
also impact parameters via the constraint on the optical depth, but otherwise all the interesting information about 
standard model parameters is in the temperature spectrum which is the easy part to analyse and the focus of this 
paper. I shall assume standard isotropic Gaussian cosmological models, that the WMAP likelihood code provides 
accurate results for the power spectrum at I < 30, the foreground model is correct, pixel noise is Gaussian and 
independent between maps and pixels, and that the beam model is accurate. This last assumption is perhaps the 
greatest for analysis of the temperature spectrum, with results depending critically on knowing the beam transfer 
function to sub-percent level. However the WMAP team have now studied the beams in great detail, and claim 
remaining uncertainties well below one percent [F| . Uncertainties due to pixelization are hard to quantify other than 
by doing many full timestream simulations, and potentially important if the effect on the window function correction 
for pixel averaging is above a few percent. However consistency between ecliptic and galactic maps suggests that as 
expected these effects are small. The point source contribution to the spectrum is determined at the 10%-level in the 
five-year results, which is small enough for uncertainties to have only a minor effect on parameter constraints Q; I 
assume the same model. 



II. POWER SPECTRUM ESTIMATION 



The general philosophy of most methods of small scale CMB data analysis is to first compress the data into a 
set of power spectrum estimators, then calculate the parameter likelihood given these estimators [l,0i[li0- Doing 
this potentially makes the analysis very fast, especially compared to a numerically very expensive direct likelihood 
evaluation from the timestream or maps. Depending on the choice of estimator, the amount of information lost by 
doing the compression can vary considerably. There is an extensive literature on different estimators, and like the 
WMAP team I shall concentrate on pseudo-C; estimators 0| • These are especially fast to compute, allowing for 
extensive testing and Monte Carlo simulation for assessing errors. 
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The simplest Pseudo-C; estimator is constructed by taking some weight function VF(ri) on the sky, and then 
calculating the power spectrum of W{n)T{il.), where T(fJ) is the CMB temperature at position f2. The window 
function can be chosen to be zero over regions where the signal is not reliable (such as the galaxy and resolved point 
sources), but can otherwise be chosen in various ways. The expectation of the pseudo power spectrum C is related 
to the true power spectrum C by some coupling matrix M and a noise power spectrum Ni, 

(Q) =^MirQ' +iVz. (1) 
I' 

Hence by inverting M one can construct unbiased pseudo-C; power spectrum estimators C = M^^{C — N). In 
practice there is more than one map, and different maps may have different noise and beam smoothing. Analysing a 
set of maps therefore requires some generalization. Methods for combining pseudo-C; estimators from different maps 
are well known and straightforward simply use the covariance between the estimators to work out a single 

combined minimum- variance estimator. This was done with the first-year WMAP results, using three different window 
functions VF(ri) for large, intermediate, and small scales; however different weight functions were not combined at 
any given In the 5-year analysis the window function was taken to be uniform at / < 500, and inverse-noise on 
smaller scales; this choice is optimal at both very large and very small scales, but suboptimal on intermediate scales 
where the size of the error bars matters most. By combining weight functions smaller error bars can be obtained 
with essentially no extra computational effort p^ . in addition to giving aesthetically more pleasing error bars that 
are continuous in /. 

In general one can define a set of weight functions W^''\fl), and apply them to the set of available (foreground- 
cleaned) maps Ta{i^) to give the set of pseudo power spectra 

m 

where 

5« ^ J dnT^{n)w('HmLm, (3) 

and Yijn is a spherical harmonic. The covariance of these estimators can be calculated approximately analytically, as 
given explicitly in the appendix of [11] for single maps. 

When there there are multiple maps it is often an excellent approximation to assume their noise is uncorrelated. 
In the case of the WMAP 5-year data there are 4 detectors^ in W-band and 2 in V-band, giving a total of (4 + 2) x 
5 = 30 independent maps (other channels are not used as the beams are larger and there is significant foreground 
contamination). Each detector is assumed to have the same effective beam for each year of observation, though the 
noise varies slightly between years (for example due to different data cuts for planets, etc.). Given a set of estimators 
{^i^bc} optimal analysis would use all of them, using all prior information about the noise to model noise biases. 
However if there is significant uncertainty in the noise in each map, marginalizing out this uncertainty essentially 
down weights all spectra involving only one map. The WMAP team therefore sensibly just throw out all estimators 
with a = (5, giving a set of cross-map estimators that have zero noise bias: 

(1 - <5„^)(CS^) = (1 - 6^p){b-%\M^=]-^c'^:l]) = (1 - <5„^)C. (4) 

The coupling matrix M^^^^ now depends on the two weight functions, and the transfer functions bi take out the 
smoothing on each map due to the beam and pixelization (I assume the effective beam and pixel transfer function is 
the same over the masked map as over the full sky) . If there are n maps, the maximum increase in error bar from 
using only cross-spectra is ~ l/2n, so for 30 maps the loss is at the percent level: a small price to pay for cleanly 
removing most systematic errors relating to noise bias. 

It remains to choose the weight functions {ly^*'' (il)}. On large scales the noise is very small, and errors are limited 
by cosmic variance, and by symmetry the optimal weighting must be uniform on the full sky. On small scales noise 
dominates, and the weight function should give more weight to parts of the sky observed more often (and hence 
having lower noise). Inverse-noise weighting is the optimal choice for a single map [l3|. I also use the 'KQ85' mask 
to remove foreground contamination (setting to zero ^ 15% of the weight map); this is smoothed on a 1/3-degree 



^ I use 'detector' as a shorthand for 'differencing assembly' 
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FIG. 1: The hybrid power spectrum (blue) compared with the WMAP estimators (red). The top panel is binned with A; = 31, 
and error bars show ilcr error bars slightly offset for clarity; the dash dotted line is A'^;^^. The lower panel shows the estimators 
at each I, with the lines showing 2a symmetric posterior diagonal errors from the assumed fiducial model; the slightly broader 
dashed lines are the corresponding diagonal errors from the WMAP likelihood code. 



scale to reduce /-couplings without significantly degrading the result.^ I therefore choose to use two weight functions, 
"KQ85 X uniform" and "KQ85 x inverse noise" , where the 'inverse-noise' map is calculated from a smoothed combined 
map. An optimal combination of estimators using these weights should smoothly interpolate between small and large 
scales where the weight functions are individually close to optimal. For further discussion of how to optimally choose 
a weight function for a Cj-estimator at a particular I from one map see Ref. [in- 
putting this all together the hybrid cross-estimator is just a particular linear combination of all the separate 
estimators: 

C=E^«)^^(1-Mci^?- (5) 

ijap 

With n^^ distinct estimators and ni different /-values, calculation of Hj^^^ requires inversion of the full [ni x n^^]"^ 
covariance matrix, which becomes computationally tedious or even difficult if there are many maps. Clearly there is 
no point in using an approximate method that is becoming as expensive as doing a proper lossless likelihood analysis. 
One option would be to neglect some of the very small off-diagonal correlations. Here, instead of calculating the full 
result for the full set of maps, I find the optimal mixing for combined-channel maps. Since the noise is very similar 



^ Since the CMB power spectrum falls rapidly at / > 200, mode coupling tends to increase the variance at a given I by mixing in large-scale 
modes with larger variance. Apodizing improves the accuracy of the approximations used to estimate the C; covariance, with further 
improvements to the few-percent level obtained by first smoothing the fiducial power spectrum on a scale appropriate to the weight 
function. 
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between years, and the beams for each detector in each band are quite similar, the dominant variation of the hybrid 
matrix comes from the weight and band index, rather than variations between years or detectors. Since there are 
many maps the difference between cross and auto-power spectrum estimator covariances is small. I then only need 
to calculate the covariance of C^'^ , where A and B label some combination of the V and W bands (giving a total 
of 10 distinct power spectrum estimators). The combined maps can be calculated easily by co-adding the maps from 
different years'^, and then summing the maps from the different detectors. This last step is slightly suboptimal due 
to the differences in beams and noise between detectors; in fact for WMAP this suboptimality is almost the same 
as that from using cross-spectrum estimators, with the result that the hybrid estimator covariance calculated from 
combined-map Monte Carlo agrees at the sub-percent level (on the diagonals) with that from the hybrid estimator 
applied to co- added cross-spectra (see Appendix [X)) . 

Rather than taxing the index-tolerance of the reader any further by explicitly writing down how the full combined- 
map covariance (and hence -ff^^') is computed, I simply refer to Refs. [1, [l^jliil from which the result is a straight- 
forward generalization. Once the hybrid mixing matrix is computed, one full map-set simulation and cross-power 
spectrum estimation with two weight maps takes about three minutes over 8 processors at 'res 10' (12 x 1024^ 
HEALPix pixels) and using l^^x = 1100. Simulation and analysis of combined-map hybrid estimators is significantly 
faster since only four weighted maps have to be spherical-transformed (two each in the V and W bands) rather than 
60. The full cross-spectrum analysis of the WMAP maps involves estimating the noise, calculating combined maps, 
generating the mixing matrix and covariance for all the estimators, forming the combined-map hybrid mixing matrix, 
calculating the hybrid estimator covariance, and then calculating the hybrid cross-power spectrum from the map data; 
all this takes around 20 minutes on 8 processors, making it fast enough to easily test the effect of changing various 
aspects of the analysis. The numerical code is available for public inspection and use^, and the spectrum from the 
foreground-cleaned WMAP 5-year maps^ is shown in Fig. [T] 

The power spectrum is very consistent with the WMAP analysis where it is cosmic variance limited. Once the 
noise becomes significant the hybrid estimator performs slightly better. The error bars in this analysis remain slightly 
smaller to small scales, and it also looks as though the outliers at high / are more consistent with the error bars in 
this analysis than the official WMAP 5-year spectrum (e.g. see the group of points around I — 900). The WMAP 
points are consistently slightly higher in the high-Z noise dominated region, though the reason for this is not entirely 
clear. Differences in the very noise-dominated region have little effect on standard parameter constraints because the 
error bars are large compared to the range of models that fit the first two acoustic peaks. 



III. CALCULATING THE LIKELIHOOD 

Having compressed the map data into a vector of estimators C, the next step is to calculate the likelihood £{C{d)\C), 
where C(0) is the theoretical power spectrum from a set of cosmological parameters 9. This step is again computa- 
tionally prohibitive to do exactly, so some approximation is required — for a detailed recent discussion of possibilities 
see Ref. [ll| . The approximation used by the WMAP likelihood code is described in Ref. [l^ and the latest treatment 
of beam and point source uncertainties in Refs. For the low-? spectrum {I < 32) I use the WMAP likelihood 

code as supplied, simply treating the high-Z temperature likelihood as an independent dataset. To isolate changes 
due to the new temperature map analysis I include the high-/ temperate-polarization correlation spectrum using the 
original WMAP likelihood code. 



For the high-/ temperature likelihood I use the new approximation suggested in Ref. jll| : 



- 21og£(C|C) (It^) (^/^' + [^^7^]"' (^/^'' + 9 (fr^j ' (6) 

where Mf is the estimator covariance evaluated for a fiducial model C/_;, N^^ is an /-dependent parameter to be 
chosen, and g{x) = sign(a; — l)y^2{x — log(a;) — 1) accounts for the skewness of the chi-squared-like distribution of 
the C estimators. For a consistent choice of Nj^^ and the same covariance this gives almost exactly the results as the 



^ The word 'co-add' appears to be specific to the cosmological data analysis community. I use a 'co-added' set of vectors {v^'^} to mean 

: a^^'^v^^\ where the weights a^'^ are chosen to approximately minimize the noise on Wj. 
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FIG. 2: Diagonal contributions to the likelihood from each I of the observed cross-spectrum Ci. The lines show the smoothed 
rms, 0.01 and 0.001 confidence limits estimated from simulations of the combined map spectra (bottom to top). Points move 
around a bit depending on which model is used to calculate the contribution to the likelihood. 



WMAP likelihood approximation for the statistical errors (though a slightly different Xos)- ^ ^^.ke 

iVr-(V<^"VA4'^-l)c^/,. (7) 

where Miif~^^^ is the estimator covariance and M^f^ is the covariance when there is no noise, both evaluated at the 
fiducial model C/.;. For this purpose the covariances are calculated using the analytic approximations; if Nj^^ is only 
used in Eq. [5] accuracy is not critical. This choice of N"^^ is consistent with the WMAP likelihood model where the 
diagonals of the covariance are taken to scale with the noise as (x (C; + Nf^)'^. For further discussion and tests of the 
accuracy of the likelihood model see Appendix [X] 

Assuming point sources behave like an isotropic gaussian random field with a white spectrum, their contribution 
can be added to Cf,i and Ci when calculating the likelihood (with their cosmic variance contribution included in 
Mf). I use a fiducial unresolved point source power spectrum with amplitude Aps = 0.011/iK^ (following Ref. [3(, 
with spectral index parameter a = 0). I calculate the point source contribution Cp^i to the C'l by combining the 
assumed white-noise spectra for each frequency appropriately over the hybrid matrix. The estimators shown in Fig. [T] 
have had this contribution subtracted. For further discussion of the point source contribution see Refs. [l^, [l^. My 
treatment of point source and beam uncertainties is described in the next section. 

The effective chi-squared XcS (defined by Eq. ([5])) of the power spectrum estimators is consistent with simulations, 
giving XcS — 936 (reduced value 1.08) using 33 < ^ < 900 for the fiducial model. This compares to XcS = 882 ± 44 in 
simulations of the fiducial model, where all values are computed using the analytic approximation for the covariance 
and zero SZ contribution. The observed value is a little high, but this should not be surprising since the fiducial model 
is expected to be somewhat off the true model, and there are also the beam and point source uncertainties. In any 
case the value is consistent at about the 1-sigma level. Consistency of XcS values is only a very minimal consistency 
check: individual Ci could still vary significantly from the expected distribution. However Fig. [2] shows that although 
there are apparent outliers in the spectrum, they only occur with a frequency about that expected by comparison 
with simulations. In the absence of any immediate evidence for deviations from the assumed theoretical framework 
on the relevant scales, I proceed to estimate parameters using the likelihood approximation described above. 
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FIG. 3: Marginalized parameter constraints of this paper (solid lines) compared to using the WMAP likelihood code with 
the spectrum and covariance from this paper (dashed) and the result from the original WMAP likelihood code and spectrum 
(dot-dashed). The bottom five parameters (except Asz) are derived from the other parameters. 



IV. PARAMETER ESTIMATION 



Using the power spectrum estimators and model for the likelihood function, cosmological parameters can be sampled 
using standard Markov chain Monte Carlo methods. I use the CosmoMC^ code which uses CAMB [l^ to calculate 
the theoretical power spectra, and restrict to vanilla flat ACDM models with six parameters of interest: reionization 
optical depth r, dark matter density flch^, baryon density i^bh^, amplitude of the primordial adiabatic curvature 
perturbation power spectrum Ag at k ^ 0.05Mpc~^ (with flat prior on logAg), constant spectral index n^, and 
angular scale measured by ^ — approximately 100 times the ratio of the sound horizon to the angular diameter 
distance at last scattering. Other parameters such as the Hubble expansion rate Hq today can be derived from 
these. I include the effect of CMB lensing and also marginalize over a SZ template amplitude Asz as in the WMAP 
papers 0, ^ generate six chains, stopping when the worst rms difference in 95% parameter confidence limits 
between chains is a small fraction of the error bar. 

In addition to the cosmological parameters and Asz I include a point source spectrum amplitude parameter to 
account for the 10% uncertainty in the point source power spectrum amplitude. Accounting for beam uncertainties 
properly is more difficult without knowing the uncertainty for each detector. However the beams are now so accurately 
modelled that beam uncertainties are not a large effect; I therefore simply take the 10 beam uncertainty modes 
identified by the WMAP team and assume that they also apply to the hybrid power spectrum. I add mode amplitude 
parameters to the Markov chain with Gaussian priors, including the effect of the beam modes on the theoretical 
power spectrum at each point in parameter space. In practice only the most important modes have a significant 
effect, so I include the top five of the ten modes as Monte Carlo parameters, and add the remaining five modes as tiny 
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non-diagonal terms in the covariance matrix. This gives essentially identical results to including all 10 beam modes 
in the chain. In total the Markov chain includes thirteen parameters: six cosmological parameters of most interest, 
Asz, one point source amplitude parameter and five beam modes. The nuisance parameters are prior driven, so their 
posteriors are similar to their priors. 

To calculate the theoretical power spectrum from the set of cosmological parameters some shape of the reionization 
history has to be assumed. Many different histories can give the same optical depth. I parameterize the history by 
a redshift parameter Zj-e and a fixed width, see Appendix [Bl for details. In this parameterization Zre measures where 
the ionization fraction was half of its maximum. The maximum fraction is about ~ 1.08 electrons per hydrogen 
atom assuming that hydrogen reionizes in roughly the same was as the first reionization of helium. The WMAP 
team used a different definition of z^e — relating it to a sharp reionization model with Xe = 1 at z < z^e — so my 
values for this parameter differ by about 6% from the WMAP 5-year results. Although this is primarily a difference 
of definition, results also differ slightly because of the slightly different large-scale iJ-polarization signal when the 
reionization history changes. The new reionization parameterization has been included in CAMB'' since the March 
2008 version. For a more detailed investigation of details of the reionization history see Ref. p^ . 

Figure [3] shows marginalized parameter constraints compared to those obtained from the WMAP likelihood code. 
The spectral index constraint is Us = 0.966 ± 0.015 (quoting 1-sigma errors), and as before = 1 is ruled out at 
just over 2-sigma. The dark matter density has shifted slightly lower, with f2c^^ = 0.106 ± 0.006. Other parameter 
values are also broadly consistent, with slight shifts to as — 0.780 ±0.036 and flm — 0.24 ±0.03, and well constrained 
combination ^l.^h^'^Ug'^'^^ — 0.125±0.002. The redshift of reionization in the new parameterization is z^e = 10.5±1.4, 
with a 2-sigma lower limit of Zre > 7.8. Extending the analysis to include a running of the spectral index rirun gives 
JT-run = -0.037 ± 0.027, very consistent with the WMAP result. 

There is a slight shift in poorly determined parameters like rim and the Hubble parameters apparent in Fig. [3] when 
using the new spectrum, N^^, covariance and point source spectrum in the WMAP likelihood code compared to the 
likelihood analysis described here. The reason for this is not entirely obvious, with a combination of small effects 
from different beam and point source uncertainty modelling, different likelihood approximation, and approximations 
used in the WMAP code. The shift may be a good indicator of the likely range of systematic error in the results due 
to likelihood modelling. These parameters are in any case somewhat sensitive to the choice of priors. The systematic 
error due to the low-/ likelihood can be estimated from Ref. 0, where changes in the polarization analysis shift the 
optical depth by ~ 0.01, comparable to the effect of reionization history modelling [l^: I have not accounted for these 
uncertainties in my results. 

V. CONCLUSIONS 

I presented a re-analysis of the 5-year WMAP temperature maps. This was certainly not independent of the analysis 
done by the WMAP team, however it is very reassuring that the results are so consistent. Using an improved power 
spectrum estimator can reduce the information loss when compressing the sky maps into a set of estimators, but the 
ultimate parameter constraints are very similar. Future experiments such as the Planck satellite will have a much 
more anisotropic noise distribution, making the gain from using hybrid- like estimators much more significant. More 
optimal estimators may do even better and be worth calculating if the likelihood function for the estimators can still 
be calculated reliably. 

The general methodology applied in this paper extends straightforwardly to small-scale polarization analysis as 
discussed and proven for single maps in Ref. [ll| . I have not attempted a reanalysis of the WMAP T-E power spectrum 
here since it contains only a modest amount of information on smaller scales, though using a better estimator might 
shrink the error bars slightly. The new likelihood parameterization would also allow the / — I and T-E correlations to 
be accounted for more consistently than in the current WMAP likelihood code. In the future a full consistent analysis 
of temperature and polarization will be much more worthwhile. 

This analysis was possible because the WMAP team have made their data available. As a further step towards 
reproducibility my numerical code is publicly available, so my power spectrum results can be exactly reproduced from 
the data supplied by the WMAP team. 
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FIG. 4: The fractional difference in diagonal error as a function of I (i.e. the square root of the diagonal of the Ci covariance) 
compared to that from 1500 simulations of the hybrid cross-spectrum estimator. The solid line shows the analytic approximation 
for the combined-map hybrid spectrum (having smoothed the input power spectrum with A; ~ 25 for I > 200). The dashed 
line is from 5000 simulations of the combined-map hybrid estimator. The dash-dotted line shows the result used by the WMAP 
likelihood code supplied on LAMBDA (which is discontinuous at I = 500 where they switch from uniform to inverse-noise 
weighting). Simulation results are smoothed over A; = 20 to reduce sampling noise. The analytic result is accurate on small 
scales because the noise dominates. 
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APPENDIX A: ACCURACY OF LIKELIHOOD MODEL 

To model the likelihood accurately the approximation of Eq. ^ needs to be reliable, and the covariance (and 
effective noise) have to be sufficiently accurate. 

Construction of the hybrid mixing matrix is done using analytic approximations for the covariances, which are 
accurate at the < 20%-level, depending on the degree of apodization of the mask used. The main inaccuracy is in 
the signal part of the covariance, where it is effectively assumed that C; are constant over the range of I that are 
coupled by the sky cut and weighting. The noise can be calculated accurately from the noise maps. This of course 
assumes the assumption of uncorrelated Gaussian noise is correct, and that noise variance is well known. I estimate 
the pixel-noise variance directly from the maps by assuming that the maps for different years differ only in their 
independent noise realizations. There is some evidence for drifts and variations in the noise variance at the 0.8% 
level; this would be enough to be slightly worrying if including auto-spectra in the analysis, but when using only 
cross-spectra mis-estimation only has a very small effect on the error bars. Calculating cross-spectra from linear 
combinations of year maps that should be independent of the CMB does give a result consistent with pure noise 
simulations. 

Figure[5]compares the diagonal values of the covariance for the various different cases and compares with the WMAP 
5-year likelihood model. The diagonal cross-spectra and combined-map covariances agree very well, and agree with 
the analytic result to < 10%. The WMAP error bars are consistently larger than from the hybrid estimators as 
expected when the signal and noise are both significant. It is not entirely clear why the results do not converge at 
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FIG. 5: The skewness /[n^'^'^f/'^) as a fraction of that expected from a reduced chi-squared distribution on the full sky. 
The sohd hues is the prediction for the hybrid estimator based on assuming a reduced chi-squared distribution determined by 
the variance and degrees of freedom. Points are binned estimates from 10000 simulations with error bars estimated from the 
values in each bin. The dashed hue is a possible improved model with a = 1 ~ f^^y 



high though inspection of the point-source power spectrum amplitudes suggests the WMAP mixing matrix is using 
a slightly different mix of W and V from the hybrid estimator. 

In addition to having the covariance accurately, there is also a question about whether the likelihood model is 
reliable. If the non-Gaussianity of the estimator distributions is not correctly modelled it could potentially give 
parameter biases as well as giving wrong error bars. The likelihood approximation of Eq. ^ is essentially treating 
each Cj-estimator as though it had a (reduced) chi-squared distribution with some degrees of freedom vi. 



As discussed in Ref. [llj, the correctness of the statistical error bars can be checked using binning: with wider 
bins the distributions become more Gaussian by the central limit theorem. Since most theoretical models give very 
smooth power spectrum (e.g. accurately recovered by splining points separated by A/ = 50 on small scales), the 
information lost by binning with A; <C 50 is very small. I calculated chains using a fiducial Gaussian likelihood model 
and binned spectra with A/ = 10. In this limit beam and point source uncertainty modes can just be added to the 
covariance as terms of the form C„iC^ for each mode C^. Parameter constraints agree very well with those from 
un-binned estimators and including the beam and point source parameters in the Markov Chain. This provides some 
confirmation that the likelihood model is consistent. However the analysis does rely on the correct beam modes being 
identified; since I have only treated this approximately the WMAP analysis should be more reliable in this respect. A 
more sophisticated treatment could also account for the non-Gaussian and non-isotropic distribution of point sources. 



For the likelihood approximation to be good at each I the estimators need to have a distribution that is a reduced 
chi-squared with some vi degrees of freedom. This is a very specific distribution where the skewness is fixed by the 
variance and effective degrees of freedom. Since the skewness is related to likelihood bias, any difference in skewness 
leads to errors in the likelihood approximation at each I. As shown in Ref. [ll| accuracy at each I is not actually 
required for reliable parameter constraints: differences between / tend to average out in many cases. For example 
using a Gaussian approximation is actually reliable despite being a poor model at each I. Above we have confirmed 
that the likelihood approximation gives reliable parameter constraints for simple models, but in general it would be 
nice to have an approximation that is reliable Z-by— Z, for example when considering strange models with glitches in 
the spectrum. 

Figure [5] shows the skewness estimated from simulations. The skewness is significantly lower than expected from 
the estimator variance and effective degrees of freedom v estimated by taking the diagonal covariance to be 2Cf/v 
when there is no noise. Expected small differences between auto- and cross-spectra [ll| cannot be resolved without 
doing many more simulations. This motivates finding a likelihood approximation that can use a more accurate model 
of the skewness. 
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On the full sky with no noise the maximum likelihood estimators are Ci = J2m kimP/(2? + l), which have a reduced 
chi-squared distribution, so 



2P{Ci\Ci)^iy Ci/Ci-\og{Ci/Ci) - z/logW2) + 21og [r(z./2)] + 2 log(Q: 



(Al) 



where z/ = 2Z + 1 is the degrees of freedom. With isotropic noise the same result holds with Ci ^ Ci + Ni and 
Ci ^ Ci + Ni since the noise is also assumed to be Gaussian. This motivates generally considering whether 



2P{Ci\Ci) ^ i^,B 



^tot 



log 



^tot 



(1 + a)Ct°* 1(1 + a)C*°t 



+ 21og(Q*°* 

- iy,s log(z^cff/2) + 2 log [T{v,s/2)] , (A2) 



where C'°* = Ci + N^^, (7*°' = C + Ni^. This has (Ci) — Ci as required for unbiased estimators, and and a 
(generally both functions of I) could be obtained from the second and third moments. Defining ly = i^cs/ (1 + «)^ these 
are 



,(2) - 



,.(3) - 



HQ - Qf) = 



2(Cf^ 

8(C;*°*)3 
(1 + 



(A3) 
(A4) 



The parameter a controls the skewness relative to that expected for a chi-squared distribution with i> degrees of 
freedom. The limit a corresponds to the reduced chi-squared distribution, and a ~* oo gives 



2PiCt\Ct) 



y{Ci - Cif 

2[Ctot]2 



21og(C;*°*) -hconst, 



(A5) 



i.e. a Gaussian. Gonsidered as a likelihood function the maximum is not at C; = Ci unless a = 0: assuming i/ ^ a it 
is instead at « Ci\l — 2a/v{l -I- a) -|- . . . ]. Using Eq. (|A2[) to fit for a from simulations gives values from about 0.15 
on large scales to 0.35 on small scales, consistent with Fig. [5l 

Generalizing Eq. (|A2p to the cut sky suggests the likelihood approximation 



2/:(c|c)«^, 



^tot 



{l + m)CY 



(1 



-ai)Cl°'[M-^]wCl?\l + ai.)g 



(1 



(1 



(A6) 



where g{x) = sign(a; — l)y/2{x — log(a;) — 1). Here terms independent of C/ have been changed such that the nor- 
malization £(C|C) — (which is not in general the maximum likelihood C). The variable Nf^ is defined so that 
C;*°*[iVf~^]n'C*;°* is almost independent of C. This could be generalized for polarization following the vectorization 
method of Ref. [U. 

Using this likelihood approximation using the crude fitting a; = 1 — f^^^i (with f^^yi = v/{2l + 1)) gives virtually 
identical results for the parameters to Eq. ([6]). This should be no surprise since using a Gaussian approximation also 
gives very similar results (as expected in typical realizations). If there were strongly /-dependent skewness parameters 
ai not resolved with a small number of simulations the effect could be larger. The approximation may also be useful 
at low I where Ci + Nf^ has a significant probability to be negative using Pseudo-C; or quadratic maximum likelihood 
estimators. 



APPENDIX B: REIONIZATION PARAMETERIZATION 



The optical depth to reionization is defined by 



■no 



d77a<°'™aT, 



(Bl) 
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FIG. 6: Three recombination histories aU with r = 0.09. The dashed hne is the model typicaUy used by CMBFAST and CAMB 
prior to March 2008 with f — 1. The black line is the new model with — 0.5, the red line with — 1.5. 



where is the number density of free electrons produced by reionization at conformal time rj, rjo is the conformal 

time today, ax is the Thomson scattering cross-section, and a is the scale factor. The total number density of free 
electrons is slightly different from n™'"" because there is a small residual ionization fraction from recombination. At 
the level of precision required this can be neglected (< 10~^), though CAMB keeps the ionization history smooth my 
mapping smoothly onto the recombination-residual. 

Since Tie oc (1 -I- z)^Xe{z), where Xe is the number of free electrons per hydrogen atom, and using the fact that 
reionization is expected to happen during matter domination. 



(X / dz XeV^ + z on / d[(l -I- z) 



n3/2i 



(B2) 



It is therefore handy to parameterize function of y = (1 + z)^''^. 

terization uses a tanh-bascd fitting function 



My) = ^ 



1 + tanh 



y - y{Zre) 



As of March 2008 CAMB's defauh parame- 



(B3) 



where y{zre) = {1 + ZreY^'^ is where Xe — f /2: i.e. Zre measures where the reionization fraction is half of its maximum. 
In other words, with this parameterization the optical depth agrees with that for an instantaneous reionization model 
at the same Zre for all (matter-dominated) values of Aj,. Except in early dark energy models this result is quite 
accurate for the expected range of z^e- In practice the input parameter is and Ay is taken to be 1.5-^1 -I- Zr-eA^. 

If hydrogen fully ionizes / = 1. However the first ionization energy of helium is similar, and it is often assumed 
that helium first re-ionizes in roughly the same way. This is indeed seems to be the case in numerical simulations 
(see e.g. Ref. f^Tl). In this case / = 1 -I- fne: where fne = ^Hc/t^h is easily calculated from the input helium mass 
fraction Yne- This is CAMB's default value of /; typically / 1.08. 

In addition dX z ^ 3.5 helium probably gets doubly ionized. Due to the low redshift this only affects the optical 
depth by ~ 0.001, but for completeness this is included using a fixed tanh-like fitting function (modifying the above 
result for r appropriately). Some reionization histories are shown in Fig.[6l 
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